

			    An Lsof API

    This lsof API definition is based on obtaining the information
    lsof currently delivers in its /dev/kmem configuration.  There
    may be better ways to get the necessary information, and I'll
    try to indicate where options are possible.

    I've not tried to determine if any current API already provides
    any of the following information.

    Lsof needs information on processes and their open files, so
    I'll divide the API discussion into those two sections, and
    add a section on how lsof might be provided path name information.

    Process Information
    ===================

    For each process lsof needs the following information:

	The name of the command executing for the process.

	The process PID.

	The process parent PID.

	The process group ID.

	The UID of the owner of the process.

	All open file information on:

	    The current working directory

	    The root directory

	    The executing text file

	    Share libraries

	    Any memory-mapped files

    Open File Information
    =====================

	For each open file, lsof needs this information:

	    File table address -- this information is used for
		checking for file descriptor leaks, for example.
		Any sort of unique ID that would enable that check
		would suffice.

	    Current file position

	    Number of open instances of the file

	    File size

	    Link count

	    Device (raw and cooked) major and minor numbers

	    File system identification -- mounted-on directory
		and mounted-on device name

	    Open state -- read, write, or read/write

	    File type -- e.g., regular, directory, fifo, pipe,
		socket, etc.

	    Vnode address -- this information is used in two ways:
		1) to identify files shared by processes; and 2) to
		locate path name components in the kernel name cache.

		Sometimes the vnode has a "capability" ID that connects
		it to the cache entry.  Lsof might need that.

		A unique identifier could satisfy the first need.
		A full path name could satisfy the second.

	    Node number -- CD node number, NFS node number, inode
		number, etc.

	    Full path names or a way to get path names from the
		kernel's name cache

	    For sockets:

		Protocol, including AF_*, PF_*, and IPPROTO_*
		    values, IPv4 or IPv6, etc.

		Local and remote IP (v4 or v6) addresses and ports

		Unix domain information -- peer ID, file system
		    object (path name, device, node number)

	    For pipes:

		Pipe ID

		Peer ID

		Pipe status -- read or write, amount in the pipe, etc.


    Path Names

	Lsof currently reports path names by using vnode addresses to
	assemble their components from the kernel name cache.

	I see two options here: 1) report full path names directly
	in the API; or 2) provide an API that delivers the kernel
	name cache data to lsof that it can connect to open files
	-- e.g., via vnode address, or (device,node_number) doublet.

	Here's what lsof extracts from the kernel name cache that
	an API would need to supply:

	    A connection between open files and cache entries, such
		a vnode address or a (device,node_number) doublet

	    The path entry name component -- e.g., the "stat.h"
		part of "/usr/include/sys/stat.h".

	    A connection from the cache entry to its immediate
		directory parent -- e.g., from stat.h to sys.
		In this cache this is usually done with a vnode
		pointer.

	    Any unique ID (i.e., a "capability" ID from the vnode)
		that might distinguish the cache entry.


    Vic Abell <abe@purdue.edu>
    March 18, 1999
